Constructive Preference Elicitation over Hybrid Combinatorial Spaces
Preference elicitation is the task of suggesting a highly preferred
configuration to a decision maker. The preferences are typically learned by
querying the user for choice feedback over pairs or sets of objects. In its
constructive variant, new objects are synthesized "from scratch" by maximizing
an estimate of the user utility over a combinatorial (possibly infinite) space
of candidates. In the constructive setting, most existing elicitation
techniques fail because they rely on exhaustive enumeration of the candidates.
A previous solution explicitly designed for constructive tasks comes with no
formal performance guarantees, and can be very expensive in (or inapplicable
to) problems with non-Boolean attributes. We propose the Choice Perceptron, a
Perceptron-like algorithm for learning user preferences from set-wise choice
feedback over constructive domains and hybrid Boolean-numeric feature spaces.
We provide a theoretical analysis on the attained regret that holds for a large
class of query selection strategies, and devise a heuristic strategy that aims
at optimizing the regret in practice. Finally, we demonstrate its effectiveness
by empirical evaluation against existing competitors on constructive scenarios
of increasing complexity.
Comment: AAAI 2018. Keywords: computing methodologies, machine learning, learning
paradigms, supervised learning, structured output.
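The set-wise update at the heart of a Perceptron-like elicitation algorithm can be sketched as follows. This is a hedged illustration of the general idea (move the weights toward the chosen configuration's features and away from the rejected ones), not the exact Choice Perceptron update; the feature vectors, averaging scheme, and step size `eta` are all illustrative assumptions.

```python
import numpy as np

def choice_update(w, chosen, others, eta=1.0):
    """Perceptron-style step from set-wise choice feedback: shift the
    utility weights toward the features of the configuration the user
    chose and away from the mean features of the rejected ones.
    (Illustrative sketch only; details differ from the actual paper.)"""
    phi_chosen = np.asarray(chosen, dtype=float)
    phi_rest = np.mean(np.asarray(others, dtype=float), axis=0)
    return w + eta * (phi_chosen - phi_rest)

# Toy example over 3-dimensional hybrid (Boolean-numeric) feature vectors.
w = np.zeros(3)
w = choice_update(w, chosen=[1.0, 0.0, 2.0],
                  others=[[0.0, 1.0, 1.0], [1.0, 1.0, 0.0]])
```

With this update, configurations resembling past choices receive higher estimated utility at the next query round.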
Decomposition Strategies for Constructive Preference Elicitation
We tackle the problem of constructive preference elicitation, that is the
problem of learning user preferences over very large decision problems,
involving a combinatorial space of possible outcomes. In this setting, the
suggested configuration is synthesized on-the-fly by solving a constrained
optimization problem, while the preferences are learned iteratively by
interacting with the user. Previous work has shown that Coactive Learning is a
suitable method for learning user preferences in constructive scenarios. In
Coactive Learning the user provides feedback to the algorithm in the form of an
improvement to a suggested configuration. When the problem involves many
decision variables and constraints, this type of interaction poses a
significant cognitive burden on the user. We propose a decomposition technique
for large preference-based decision problems relying exclusively on inference
and feedback over partial configurations. This has the clear advantage of
drastically reducing the user cognitive load. Additionally, part-wise inference
can be (up to exponentially) less computationally demanding than inference over
full configurations. We discuss the theoretical implications of working with
parts and present promising empirical results on one synthetic and two
realistic constructive problems.
Comment: Accepted at the Thirty-Second AAAI Conference on Artificial
Intelligence (AAAI-18).
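The part-wise interaction described above can be sketched as a Coactive Learning update restricted to one part of the configuration: the user improves only a partial configuration, so only the corresponding weights change. The index-based decomposition, feature maps, and step size are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def partwise_coactive_update(w, phi_suggested, phi_improved, part, eta=1.0):
    """Coactive-Learning-style step on a partial configuration: the
    update moves the weights toward the user's improvement, but only
    on the feature indices in `part`, leaving the rest untouched.
    (Hedged sketch; the thesis decomposition is more general.)"""
    w = np.array(w, dtype=float)
    diff = np.asarray(phi_improved, float) - np.asarray(phi_suggested, float)
    w[part] += eta * diff[part]
    return w

# Feedback concerns only the first two decision variables (indices 0, 1).
w = partwise_coactive_update(np.zeros(4),
                             phi_suggested=[1.0, 0.0, 1.0, 0.0],
                             phi_improved=[0.0, 1.0, 1.0, 1.0],
                             part=[0, 1])
```

Because each query and update touches only one part, both the user's cognitive load and the cost of inference drop relative to full-configuration feedback.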
Statistical Relational Learning for Proteomics: Function, Interactions and Evolution
In recent years, the field of Statistical Relational Learning (SRL) [1, 2] has
produced new, powerful learning methods that are explicitly designed to solve
complex problems, such as collective classification, multi-task learning and
structured output prediction, and that natively handle relational data, noise,
and partial information. Statistical-relational methods rely on some First-Order
Logic as a general, expressive formal language to encode both the data
instances and the relations or constraints between them. The latter encode
background knowledge on the problem domain, and are used to restrict or bias
the model search space according to the instructions of domain experts. The
new tools developed within SRL make it possible to revisit old computational
biology problems in a less ad hoc fashion, and to tackle novel, more complex
ones. Motivated by these developments, in this thesis we describe and discuss
the application of SRL to three important biological problems, highlighting the
advantages, discussing the trade-offs, and pointing out the open problems.
In particular, in Chapter 3 we show how to jointly improve the outputs
of multiple correlated predictors of protein features by means of a very
general probabilistic-logical consistency layer. The logical layer — based on
grounding-specific Markov Logic networks [3] — enforces a set of weighted
first-order rules encoding biologically motivated constraints between the
predictions. The refiner then improves the raw predictions so that they least
violate the constraints. Contrary to canonical methods for the prediction
of protein features, which typically take predicted correlated features as
inputs to improve the output post facto, our method can jointly refine all
predictions together, with potential gains in overall consistency. In order
to showcase our method, we integrate three stand-alone predictors of
correlated features, namely subcellular localization (Loctree [4]), disulfide
bonding state (Disulfind [5]), and metal bonding state (MetalDetector [6]), in
a way that takes into account the respective strengths and weaknesses. The
experimental results show that the refiner can improve the performance of the
underlying predictors by removing rule violations. In addition, the proposed
method is fully general, and could in principle be applied to an array of
heterogeneous predictions without requiring any change to the underlying
software.
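The consistency-layer idea can be illustrated with a toy version: search for the joint Boolean assignment over predicted features that best agrees with the raw predictor scores while paying a weighted penalty for each violated rule. Exhaustive enumeration here stands in for Markov Logic MAP inference, and the feature names, scores, and rule are illustrative assumptions, not the thesis model.

```python
import itertools

def refine(raw_scores, rules):
    """Toy stand-in for the probabilistic-logical consistency layer:
    enumerate joint Boolean assignments of the predicted features,
    reward agreement with the raw scores, subtract the weight of every
    violated rule, and return the best assignment.
    (Sketch only; real inference does not enumerate.)"""
    feats = sorted(raw_scores)
    best, best_val = None, float("-inf")
    for bits in itertools.product([0, 1], repeat=len(feats)):
        y = dict(zip(feats, bits))
        val = sum(raw_scores[f] * (1 if y[f] else -1) for f in feats)
        val -= sum(w for w, rule in rules if not rule(y))
        if val > best_val:
            best, best_val = y, val
    return best

# Illustrative constraint: a residue cannot be both disulfide- and
# metal-bonded; weight 5.0 makes violating it very costly.
rules = [(5.0, lambda y: not (y["disulfide"] and y["metal"]))]
raw = {"disulfide": 0.9, "metal": 0.4}
refined = refine(raw, rules)
```

Note how the weaker raw prediction ("metal") is flipped off to satisfy the mutual-exclusion rule, while the stronger one survives, which is the joint-refinement behavior the chapter describes.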
In Chapter 4 we consider the multi-level protein–protein interaction (PPI)
prediction problem. In general, PPIs can be seen as a hierarchical process
occurring at three related levels: proteins bind by means of specific domains,
which in turn form interfaces through patches of residues. Detailed knowledge
about which domains and residues are involved in a given interaction has
extensive applications to biology, including better understanding of the
binding process and more efficient drug/enzyme design. We cast the prediction
problem in terms of multi-task learning, with one task per level (proteins,
domains and residues), and propose a machine learning method that collectively
infers the binding state of all object pairs, at all levels, concurrently.
Our method is based on Semantic Based Regularization (SBR) [7], a flexible
and theoretically sound SRL framework that employs First-Order Logic
constraints to tie the learning tasks together. Contrary to most current PPI
prediction methods, which neither identify which regions of a protein
actually instantiate an interaction nor leverage the hierarchy of predictions,
our method resolves the prediction problem down to the residue level, enforces
consistent predictions between the hierarchy levels, and fruitfully exploits
the hierarchical nature of the problem. We present numerical results showing
that our method substantially outperforms the baseline in several
experimental settings, indicating that our multi-level formulation can indeed
lead to better predictions.
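The hierarchy constraint tying the three levels together can be made concrete with a minimal sketch: a residue pair can interact only if its domain pair does, and a domain pair only if the protein pair does. SBR enforces such constraints softly during learning; applying them as a hard post-hoc filter, as below, is an illustrative simplification, and all identifiers are hypothetical.

```python
def enforce_hierarchy(protein_pair, domain_pairs, residue_pairs, dom_of):
    """Hard version of the multi-level PPI consistency constraint:
    a domain-pair prediction survives only if the protein pair binds,
    and a residue-pair prediction only if its domain pair survives.
    `dom_of` maps each residue pair to its domain pair.
    (Hedged sketch; SBR imposes this softly via logic constraints.)"""
    domain_pairs = {d: v and protein_pair for d, v in domain_pairs.items()}
    residue_pairs = {r: v and domain_pairs[dom_of[r]]
                     for r, v in residue_pairs.items()}
    return protein_pair, domain_pairs, residue_pairs

prot, doms, res = enforce_hierarchy(
    protein_pair=True,
    domain_pairs={"d1": True, "d2": False},
    residue_pairs={"r1": True, "r2": True},
    dom_of={"r1": "d1", "r2": "d2"})
```

Here the residue-level prediction `r2` is suppressed because its domain pair `d2` is predicted not to interact, which is exactly the cross-level consistency the chapter enforces.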
Finally, in Chapter 5 we consider the problem of predicting drug-resistant
protein mutations through a combination of Inductive Logic Programming [8,
9] and Statistical Relational Learning. In particular, we focus on viral
proteins: viruses are typically characterized by high mutation rates, which
allow them to quickly develop drug-resistant mutations. Mining relevant rules
from mutation data can be extremely useful to understand the virus adaptation
mechanism and to design drugs that effectively counter potentially resistant
mutants. We propose a simple approach for mutant prediction where the input
consists of mutation data with drug-resistance information, either as sets
of mutations conferring resistance to a certain drug, or as sets of mutants
with information on their susceptibility to the drug. The algorithm learns a
set of relational rules characterizing drug-resistance, and uses them to
generate a set of potentially resistant mutants. Learning a weighted
combination of rules allows us to attach a resistance score, as predicted by
the statistical relational model, to each generated mutant and to select only
the highest scoring ones. Promising results were obtained in generating
resistant mutations for both nucleoside and non-nucleoside HIV reverse
transcriptase inhibitors. The approach can be generalized quite easily to
learning mutants characterized by more complex rules correlating multiple
mutations.
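The final scoring step described above can be sketched as summing the weights of the learned rules that fire on a candidate mutant. The rules and weights below are toy assumptions, not learned from data; the mutation names (K103N, Y181C) are well-known HIV reverse transcriptase resistance mutations used purely for illustration.

```python
def score_mutant(mutant, weighted_rules):
    """Resistance score of a candidate mutant (a set of mutations):
    sum the weights of every rule whose required mutations are all
    present. (Hedged sketch of the weighted-rule-combination idea;
    the thesis learns the rules and weights from resistance data.)"""
    return sum(w for w, required in weighted_rules if required <= mutant)

# Toy weighted rules: each fires when its mutation set is present.
rules = [(2.0, {"K103N"}), (1.5, {"Y181C", "K103N"})]

# Rank hypothetical generated mutants and keep the highest scoring one.
top = max([{"K103N"}, {"Y181C"}, {"K103N", "Y181C"}],
          key=lambda m: score_mutant(m, rules))
```

Mutants matching several high-weight rules accumulate larger scores, which is what lets the method select only the most plausibly resistant candidates.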